Overview

Dataset statistics

Number of variables: 19
Number of observations: 8892
Missing cells: 2462
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.2 MiB
Average record size in memory: 613.1 B
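
The derived figures in this block can be cross-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is the missing cell count divided by variables times observations, and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 19
n_observations = 8892
missing_cells = 2462
total_size_mib = 5.2  # already rounded in the report, so results are approximate

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Because the total size is only given to one decimal place, the recomputed record size agrees with the reported 613.1 B only approximately.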

Variable types

CAT: 10
NUM: 7
BOOL: 2

Warnings

periodtype has constant value "1" Constant
pertypdesc has constant value "Annual" Constant
period has constant value "0" Constant
areaname has a high cardinality: 73 distinct values High cardinality
area is highly correlated with areatype High correlation
areatype is highly correlated with area High correlation
population is highly correlated with stfips High correlation
stfips is highly correlated with population High correlation
statename is highly correlated with stateabbrv and 4 other fields High correlation
stateabbrv is highly correlated with statename and 4 other fields High correlation
stfips is highly correlated with stateabbrv and 4 other fields High correlation
areatyname is highly correlated with stateabbrv and 4 other fields High correlation
areaname is highly correlated with stateabbrv and 4 other fields High correlation
areatype is highly correlated with stateabbrv and 4 other fields High correlation
incsource is highly correlated with incdesc and 1 other field High correlation
incdesc is highly correlated with incsource and 1 other field High correlation
incsrcdesc is highly correlated with incdesc and 1 other field High correlation
incrank has 390 (4.4%) missing values Missing
population has 390 (4.4%) missing values Missing
releasedate has 1682 (18.9%) missing values Missing
area has 448 (5.0%) zeros Zeros
incrank has 211 (2.4%) zeros Zeros
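
The Missing and Zeros percentages can be recomputed from the counts in the warnings and the 8892 observations. The sketch below also sums the per-column missing counts, assuming the three columns flagged above are the only ones with missing values; the helper name `pct` is illustrative.

```python
n_observations = 8892

# Counts copied from the Missing and Zeros warnings.
missing = {"incrank": 390, "population": 390, "releasedate": 1682}
zeros = {"area": 448, "incrank": 211}

def pct(count, total=n_observations):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)

missing_pct = {name: pct(c) for name, c in missing.items()}
zeros_pct = {name: pct(c) for name, c in zeros.items()}
total_missing = sum(missing.values())

print(missing_pct)
print(zeros_pct)
print(total_missing)
```

Under that assumption the per-column missing counts add up to 2462, which agrees with the "Missing cells" figure in the dataset statistics.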

Reproduction

Analysis started: 2020-12-12 20:11:30.962036
Analysis finished: 2020-12-12 20:11:37.403580
Duration: 6.44 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
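
The reported duration is simply the difference between the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps were printed):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-12-12 20:11:30.962036", FMT)
finished = datetime.strptime("2020-12-12 20:11:37.403580", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```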

Variables

stateabbrv
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
CO | 8671 | 97.5%
US | 221 | 2.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
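
As an illustration of how such character properties can be tallied, the sketch below rebuilds the character and category counts for stateabbrv from its two values and their frequencies. It uses Python's standard `unicodedata` module, which exposes the general category (for example `Lu` for uppercase letters) but not scripts or blocks, so only that part of the overview is reproduced. This is not necessarily how the report itself was computed.

```python
import unicodedata
from collections import Counter

# Value counts copied from the stateabbrv frequency table.
value_counts = {"CO": 8671, "US": 221}

char_counts = Counter()
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count
        # unicodedata.category returns a two-letter code such as "Lu"
        # (Letter, uppercase).
        category_counts[unicodedata.category(ch)] += count

print(char_counts)
print(category_counts)
```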

Most occurring characters

Value | Count | Frequency (%)
C | 8671 | 48.8%
O | 8671 | 48.8%
U | 221 | 1.2%
S | 221 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 17784 | 100.0%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 8671 | 48.8%
O | 8671 | 48.8%
U | 221 | 1.2%
S | 221 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 17784 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
C | 8671 | 48.8%
O | 8671 | 48.8%
U | 221 | 1.2%
S | 221 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17784 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
C | 8671 | 48.8%
O | 8671 | 48.8%
U | 221 | 1.2%
S | 221 | 1.2%

statename
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
Colorado | 8671 | 97.5%
U.S.A. | 221 | 2.5%

Frequencies of value counts
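
The HIGH CORRELATION flag on statename is unsurprising given that its value counts mirror those of stateabbrv. One common association measure for two categorical columns is Cramér's V; the sketch below computes the plain (uncorrected) version on a hypothetical cross-tabulation that assumes every CO row is labelled Colorado and every US row is labelled U.S.A. That mapping is consistent with the counts but is not stated in the report, and the exact statistic and threshold used to raise the warning may differ.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical cross-tabulation: rows = stateabbrv (CO, US),
# columns = statename (Colorado, U.S.A.).
table = [[8671, 0], [0, 221]]
v = cramers_v(table)
print(round(v, 3))
```

A value close to 1 indicates that one column fully determines the other, in which case one of the two is redundant for modelling purposes.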

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.950292398
Min length: 6
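
The mean length is a frequency-weighted average of the string lengths. A short check using the two statename values and their counts (variable names are illustrative):

```python
# Value counts copied from the statename frequency table.
value_counts = {"Colorado": 8671, "U.S.A.": 221}

total_rows = sum(value_counts.values())
# Total number of characters across all rows.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / total_rows

print(total_chars)
print(round(mean_length, 9))
```

The total character count obtained this way, 70694, is the same figure the report lists for the ASCII block of this variable.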

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
o | 26013 | 36.8%
C | 8671 | 12.3%
l | 8671 | 12.3%
r | 8671 | 12.3%
a | 8671 | 12.3%
d | 8671 | 12.3%
. | 663 | 0.9%
U | 221 | 0.3%
S | 221 | 0.3%
A | 221 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 60697 | 85.9%
Uppercase Letter | 9334 | 13.2%
Other Punctuation | 663 | 0.9%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 8671 | 92.9%
U | 221 | 2.4%
S | 221 | 2.4%
A | 221 | 2.4%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
. | 663 | 100.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 26013 | 42.9%
l | 8671 | 14.3%
r | 8671 | 14.3%
a | 8671 | 14.3%
d | 8671 | 14.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 70031 | 99.1%
Common | 663 | 0.9%

Most frequent Latin characters

Value | Count | Frequency (%)
o | 26013 | 37.1%
C | 8671 | 12.4%
l | 8671 | 12.4%
r | 8671 | 12.4%
a | 8671 | 12.4%
d | 8671 | 12.4%
U | 221 | 0.3%
S | 221 | 0.3%
A | 221 | 0.3%

Most frequent Common characters

Value | Count | Frequency (%)
. | 663 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 70694 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
o | 26013 | 36.8%
C | 8671 | 12.3%
l | 8671 | 12.3%
r | 8671 | 12.3%
a | 8671 | 12.3%
d | 8671 | 12.3%
. | 663 | 0.9%
U | 221 | 0.3%
S | 221 | 0.3%
A | 221 | 0.3%

stfips
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
8 | 8671 | 97.5%
0 | 221 | 2.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
8 | 8671 | 97.5%
0 | 221 | 2.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8892 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
8 | 8671 | 97.5%
0 | 221 | 2.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8892 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
8 | 8671 | 97.5%
0 | 221 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8892 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
8 | 8671 | 97.5%
0 | 221 | 2.5%

areatyname
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
County | 7800 | 87.7%
Metropolitan Statistical Area | 644 | 7.2%
State | 227 | 2.6%
United States | 221 | 2.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 29
Median length: 6
Mean length: 7.814215025
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
t | 12137 | 17.5%
o | 9088 | 13.1%
n | 8665 | 12.5%
C | 7800 | 11.2%
u | 7800 | 11.2%
y | 7800 | 11.2%
a | 3024 | 4.4%
i | 2153 | 3.1%
e | 1957 | 2.8%
(space) | 1509 | 2.2%
r | 1288 | 1.9%
l | 1288 | 1.9%
S | 1092 | 1.6%
s | 865 | 1.2%
M | 644 | 0.9%
p | 644 | 0.9%
c | 644 | 0.9%
A | 644 | 0.9%
U | 221 | 0.3%
d | 221 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 57574 | 82.9%
Uppercase Letter | 10401 | 15.0%
Space Separator | 1509 | 2.2%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 7800 | 75.0%
S | 1092 | 10.5%
M | 644 | 6.2%
A | 644 | 6.2%
U | 221 | 2.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
t | 12137 | 21.1%
o | 9088 | 15.8%
n | 8665 | 15.1%
u | 7800 | 13.5%
y | 7800 | 13.5%
a | 3024 | 5.3%
i | 2153 | 3.7%
e | 1957 | 3.4%
r | 1288 | 2.2%
l | 1288 | 2.2%
s | 865 | 1.5%
p | 644 | 1.1%
c | 644 | 1.1%
d | 221 | 0.4%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 1509 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 67975 | 97.8%
Common | 1509 | 2.2%

Most frequent Latin characters

Value | Count | Frequency (%)
t | 12137 | 17.9%
o | 9088 | 13.4%
n | 8665 | 12.7%
C | 7800 | 11.5%
u | 7800 | 11.5%
y | 7800 | 11.5%
a | 3024 | 4.4%
i | 2153 | 3.2%
e | 1957 | 2.9%
r | 1288 | 1.9%
l | 1288 | 1.9%
S | 1092 | 1.6%
s | 865 | 1.3%
M | 644 | 0.9%
p | 644 | 0.9%
c | 644 | 0.9%
A | 644 | 0.9%
U | 221 | 0.3%
d | 221 | 0.3%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 1509 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 69484 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
t | 12137 | 17.5%
o | 9088 | 13.1%
n | 8665 | 12.5%
C | 7800 | 11.2%
u | 7800 | 11.2%
y | 7800 | 11.2%
a | 3024 | 4.4%
i | 2153 | 3.1%
e | 1957 | 2.8%
(space) | 1509 | 2.2%
r | 1288 | 1.9%
l | 1288 | 1.9%
S | 1092 | 1.6%
s | 865 | 1.2%
M | 644 | 0.9%
p | 644 | 0.9%
c | 644 | 0.9%
A | 644 | 0.9%
U | 221 | 0.3%
d | 221 | 0.3%

areaname
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 73
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
Colorado | 227 | 2.6%
United States | 221 | 2.5%
Morgan County | 122 | 1.4%
Pueblo County | 122 | 1.4%
Dolores County | 122 | 1.4%
Phillips County | 122 | 1.4%
Grand County | 122 | 1.4%
Mesa County | 122 | 1.4%
Crowley County | 122 | 1.4%
La Plata County | 122 | 1.4%
Hinsdale County | 122 | 1.4%
Summit County | 122 | 1.4%
Washington County | 122 | 1.4%
Montezuma County | 122 | 1.4%
Baca County | 122 | 1.4%
Chaffee County | 122 | 1.4%
Mineral County | 122 | 1.4%
Cheyenne County | 122 | 1.4%
Rio Blanco County | 122 | 1.4%
San Miguel County | 122 | 1.4%
Bent County | 122 | 1.4%
Yuma County | 122 | 1.4%
Garfield County | 122 | 1.4%
Adams County | 122 | 1.4%
Otero County | 122 | 1.4%
Other values (48) | 5638 | 63.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 25
Median length: 14
Mean length: 14.00933423
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 46
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
o | 13839 | 11.1%
n | 12753 | 10.2%
t | 10935 | 8.8%
(space) | 10223 | 8.2%
u | 9998 | 8.0%
C | 9309 | 7.5%
y | 8258 | 6.6%
e | 6538 | 5.2%
a | 6428 | 5.2%
r | 4585 | 3.7%
l | 4279 | 3.4%
i | 3295 | 2.6%
s | 2845 | 2.3%
d | 1906 | 1.5%
S | 1567 | 1.3%
M | 1498 | 1.2%
A | 1346 | 1.1%
m | 1304 | 1.0%
g | 1160 | 0.9%
f | 1090 | 0.9%
c | 946 | 0.8%
P | 946 | 0.8%
L | 916 | 0.7%
h | 854 | 0.7%
G | 794 | 0.6%
Other values (21) | 6959 | 5.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 93577 | 75.1%
Uppercase Letter | 20495 | 16.5%
Space Separator | 10223 | 8.2%
Dash Punctuation | 276 | 0.2%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 9309 | 45.4%
S | 1567 | 7.6%
M | 1498 | 7.3%
A | 1346 | 6.6%
P | 946 | 4.6%
L | 916 | 4.5%
G | 794 | 3.9%
B | 694 | 3.4%
D | 580 | 2.8%
J | 458 | 2.2%
E | 366 | 1.8%
R | 366 | 1.8%
H | 244 | 1.2%
K | 244 | 1.2%
O | 244 | 1.2%
W | 244 | 1.2%
U | 221 | 1.1%
F | 214 | 1.0%
T | 122 | 0.6%
Y | 122 | 0.6%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 13839 | 14.8%
n | 12753 | 13.6%
t | 10935 | 11.7%
u | 9998 | 10.7%
y | 8258 | 8.8%
e | 6538 | 7.0%
a | 6428 | 6.9%
r | 4585 | 4.9%
l | 4279 | 4.6%
i | 3295 | 3.5%
s | 2845 | 3.0%
d | 1906 | 2.0%
m | 1304 | 1.4%
g | 1160 | 1.2%
f | 1090 | 1.2%
c | 946 | 1.0%
h | 854 | 0.9%
k | 732 | 0.8%
w | 488 | 0.5%
p | 458 | 0.5%
b | 336 | 0.4%
v | 306 | 0.3%
j | 122 | 0.1%
z | 122 | 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 10223 | 100.0%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 276 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 114072 | 91.6%
Common | 10499 | 8.4%

Most frequent Latin characters

Value | Count | Frequency (%)
o | 13839 | 12.1%
n | 12753 | 11.2%
t | 10935 | 9.6%
u | 9998 | 8.8%
C | 9309 | 8.2%
y | 8258 | 7.2%
e | 6538 | 5.7%
a | 6428 | 5.6%
r | 4585 | 4.0%
l | 4279 | 3.8%
i | 3295 | 2.9%
s | 2845 | 2.5%
d | 1906 | 1.7%
S | 1567 | 1.4%
M | 1498 | 1.3%
A | 1346 | 1.2%
m | 1304 | 1.1%
g | 1160 | 1.0%
f | 1090 | 1.0%
c | 946 | 0.8%
P | 946 | 0.8%
L | 916 | 0.8%
h | 854 | 0.7%
G | 794 | 0.7%
k | 732 | 0.6%
Other values (19) | 5951 | 5.2%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 10223 | 97.4%
- | 276 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 124571 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
o | 13839 | 11.1%
n | 12753 | 10.2%
t | 10935 | 8.8%
(space) | 10223 | 8.2%
u | 9998 | 8.0%
C | 9309 | 7.5%
y | 8258 | 6.6%
e | 6538 | 5.2%
a | 6428 | 5.2%
r | 4585 | 3.7%
l | 4279 | 3.4%
i | 3295 | 2.6%
s | 2845 | 2.3%
d | 1906 | 1.5%
S | 1567 | 1.3%
M | 1498 | 1.2%
A | 1346 | 1.1%
m | 1304 | 1.0%
g | 1160 | 0.9%
f | 1090 | 0.9%
c | 946 | 0.8%
P | 946 | 0.8%
L | 916 | 0.7%
h | 854 | 0.7%
G | 794 | 0.6%
Other values (21) | 6959 | 5.6%

areatype
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
4 | 7800 | 87.7%
21 | 644 | 7.2%
1 | 227 | 2.6%
0 | 221 | 2.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.072424651
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
4 | 7800 | 81.8%
1 | 871 | 9.1%
2 | 644 | 6.8%
0 | 221 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 9536 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
4 | 7800 | 81.8%
1 | 871 | 9.1%
2 | 644 | 6.8%
0 | 221 | 2.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 9536 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
4 | 7800 | 81.8%
1 | 871 | 9.1%
2 | 644 | 6.8%
0 | 221 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9536 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
4 | 7800 | 81.8%
1 | 871 | 9.1%
2 | 644 | 6.8%
0 | 221 | 2.3%

area
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 72
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1740.473909
Minimum: 0
Maximum: 39380
Zeros: 448
Zeros (%): 5.0%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 27
median: 63
Q3: 101
95-th percentile: 19740
Maximum: 39380
Range: 39380
Interquartile range (IQR): 74

Descriptive statistics

Standard deviation: 6337.988252
Coefficient of variation (CV): 3.641530171
Kurtosis: 15.78066892
Mean: 1740.473909
Median Absolute Deviation (MAD): 36
Skewness: 3.961619164
Sum: 15476294
Variance: 40170095.08
Monotonicity: Not monotonic
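
Several of these statistics are simple functions of the others. The sketch below recomputes the coefficient of variation, interquartile range, range, and variance for area from the figures listed above, using the usual definitions (CV as standard deviation over mean, IQR as Q3 minus Q1). The inputs are the rounded values printed in the report, so agreement is only up to rounding.

```python
# Figures copied from the quantile and descriptive statistics for area.
mean = 1740.473909
std = 6337.988252
q1, q3 = 27, 101
minimum, maximum = 0, 39380

cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
variance = std ** 2                # variance is the squared standard deviation

print(round(cv, 4), iqr, value_range, round(variance))
```

A CV well above 1, together with the large gap between Q3 (101) and the 95th percentile (19740), suggests the column mixes two very different scales of area code.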
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
0 | 448 | 5.0%
59 | 122 | 1.4%
11 | 122 | 1.4%
19 | 122 | 1.4%
27 | 122 | 1.4%
35 | 122 | 1.4%
43 | 122 | 1.4%
51 | 122 | 1.4%
67 | 122 | 1.4%
13 | 122 | 1.4%
75 | 122 | 1.4%
83 | 122 | 1.4%
91 | 122 | 1.4%
99 | 122 | 1.4%
107 | 122 | 1.4%
115 | 122 | 1.4%
3 | 122 | 1.4%
121 | 122 | 1.4%
113 | 122 | 1.4%
105 | 122 | 1.4%
97 | 122 | 1.4%
89 | 122 | 1.4%
81 | 122 | 1.4%
73 | 122 | 1.4%
65 | 122 | 1.4%
Other values (47) | 5516 | 62.0%

Smallest values

Value | Count | Frequency (%)
0 | 448 | 5.0%
1 | 122 | 1.4%
3 | 122 | 1.4%
5 | 122 | 1.4%
7 | 122 | 1.4%
9 | 122 | 1.4%
11 | 122 | 1.4%
13 | 122 | 1.4%
14 | 114 | 1.3%
15 | 122 | 1.4%

Largest values

Value | Count | Frequency (%)
39380 | 92 | 1.0%
24540 | 92 | 1.0%
24300 | 92 | 1.0%
22660 | 92 | 1.0%
19740 | 92 | 1.0%
17820 | 92 | 1.0%
14500 | 92 | 1.0%
125 | 122 | 1.4%
123 | 122 | 1.4%
121 | 122 | 1.4%

periodyear
Real number (ℝ≥0)

Distinct: 89
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1994.100765
Minimum: 1929
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 1929
5-th percentile: 1970
Q1: 1983
median: 1996
Q3: 2007
95-th percentile: 2015
Maximum: 2017
Range: 88
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 15.11504904
Coefficient of variation (CV): 0.007579882276
Kurtosis: 0.3524149794
Mean: 1994.100765
Median Absolute Deviation (MAD): 12
Skewness: -0.6190983632
Sum: 17731544
Variance: 228.4647075
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
2009 | 216 | 2.4%
2008 | 216 | 2.4%
2010 | 216 | 2.4%
2011 | 215 | 2.4%
2012 | 215 | 2.4%
2013 | 215 | 2.4%
2014 | 215 | 2.4%
2001 | 212 | 2.4%
2002 | 212 | 2.4%
2006 | 212 | 2.4%
2003 | 212 | 2.4%
2007 | 212 | 2.4%
2004 | 212 | 2.4%
2005 | 212 | 2.4%
1993 | 211 | 2.4%
1995 | 211 | 2.4%
2000 | 211 | 2.4%
1998 | 211 | 2.4%
1990 | 211 | 2.4%
1999 | 211 | 2.4%
1989 | 211 | 2.4%
1997 | 211 | 2.4%
2015 | 201 | 2.3%
2016 | 197 | 2.2%
1985 | 148 | 1.7%
Other values (64) | 3666 | 41.2%

Smallest values

Value | Count | Frequency (%)
1929 | 4 | < 0.1%
1930 | 4 | < 0.1%
1931 | 4 | < 0.1%
1932 | 4 | < 0.1%
1933 | 4 | < 0.1%
1934 | 4 | < 0.1%
1935 | 4 | < 0.1%
1936 | 4 | < 0.1%
1937 | 4 | < 0.1%
1938 | 4 | < 0.1%

Largest values

Value | Count | Frequency (%)
2017 | 132 | 1.5%
2016 | 197 | 2.2%
2015 | 201 | 2.3%
2014 | 215 | 2.4%
2013 | 215 | 2.4%
2012 | 215 | 2.4%
2011 | 215 | 2.4%
2010 | 216 | 2.4%
2009 | 216 | 2.4%
2008 | 216 | 2.4%

periodtype
Boolean

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
1 | 8892 | 100.0%

pertypdesc
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
Annual | 8892 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
n | 17784 | 33.3%
A | 8892 | 16.7%
u | 8892 | 16.7%
a | 8892 | 16.7%
l | 8892 | 16.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 44460 | 83.3%
Uppercase Letter | 8892 | 16.7%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
A | 8892 | 100.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 17784 | 40.0%
u | 8892 | 20.0%
a | 8892 | 20.0%
l | 8892 | 20.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 53352 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
n | 17784 | 33.3%
A | 8892 | 16.7%
u | 8892 | 16.7%
a | 8892 | 16.7%
l | 8892 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 53352 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
n | 17784 | 33.3%
A | 8892 | 16.7%
u | 8892 | 16.7%
a | 8892 | 16.7%
l | 8892 | 16.7%

period
Boolean

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
0 | 8892 | 100.0%

inctype
Real number (ℝ≥0)

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.782276203
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7649449563
Coefficient of variation (CV): 0.4291955169
Kurtosis: 0.8252741754
Mean: 1.782276203
Median Absolute Deviation (MAD): 1
Skewness: 0.7283468841
Sum: 15848
Variance: 0.5851407861
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)

Most frequent values

Value | Count | Frequency (%)
2 | 3636 | 40.9%
1 | 3636 | 40.9%
3 | 1588 | 17.9%
6 | 16 | 0.2%
5 | 16 | 0.2%

Smallest values

Value | Count | Frequency (%)
1 | 3636 | 40.9%
2 | 3636 | 40.9%
3 | 1588 | 17.9%
5 | 16 | 0.2%
6 | 16 | 0.2%

Largest values

Value | Count | Frequency (%)
6 | 16 | 0.2%
5 | 16 | 0.2%
3 | 1588 | 17.9%
2 | 3636 | 40.9%
1 | 3636 | 40.9%
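
Because inctype takes only five values, its sum, mean, variance, and standard deviation can be reconstructed from the value counts alone. The sketch below does so, assuming the reported variance is the sample variance (dividing by n - 1, which is the pandas default); with that assumption the results line up with the descriptive statistics for this variable.

```python
import math

# Value counts copied from the inctype frequency table.
value_counts = {1: 3636, 2: 3636, 3: 1588, 5: 16, 6: 16}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# Sample variance: sum of squared deviations divided by n - 1.
ss = sum(count * (value - mean) ** 2 for value, count in value_counts.items())
variance = ss / (n - 1)
std = math.sqrt(variance)

print(n, total)
print(round(mean, 9), round(variance, 10), round(std, 10))
```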
 

incdesc
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
Total Personal Income - Bureau of Economic Analysis | 3636 | 40.9%
Per Capita Personal Income - Bureau of Economic Analysis | 3636 | 40.9%
Median Household Income - United States Census | 1588 | 17.9%
Dividends, Interest, and Rent Income - Bureau of Economic Analysis | 16 | 0.2%
Transfer Payments - Bureau of Economic Analysis | 16 | 0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 66
Median length: 51
Mean length: 52.17139001
Min length: 46

Overview of Unicode Properties

Unique unicode characters: 34
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
(space) | 64308 | 13.9%
o | 44872 | 9.7%
a | 36012 | 7.8%
n | 35616 | 7.7%
e | 35124 | 7.6%
s | 28296 | 6.1%
c | 23484 | 5.1%
i | 21452 | 4.6%
l | 19800 | 4.3%
r | 18260 | 3.9%
u | 17784 | 3.8%
m | 16196 | 3.5%
t | 12100 | 2.6%
P | 10924 | 2.4%
I | 8892 | 1.9%
- | 8892 | 1.9%
f | 7320 | 1.6%
y | 7320 | 1.6%
B | 7304 | 1.6%
E | 7304 | 1.6%
A | 7304 | 1.6%
C | 5224 | 1.1%
d | 4812 | 1.0%
T | 3652 | 0.8%
p | 3636 | 0.8%
Other values (9) | 8020 | 1.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 333688 | 71.9%
Space Separator | 64308 | 13.9%
Uppercase Letter | 56988 | 12.3%
Dash Punctuation | 8892 | 1.9%
Other Punctuation | 32 | < 0.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
P | 10924 | 19.2%
I | 8892 | 15.6%
B | 7304 | 12.8%
E | 7304 | 12.8%
A | 7304 | 12.8%
C | 5224 | 9.2%
T | 3652 | 6.4%
M | 1588 | 2.8%
H | 1588 | 2.8%
U | 1588 | 2.8%
S | 1588 | 2.8%
D | 16 | < 0.1%
R | 16 | < 0.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 44872 | 13.4%
a | 36012 | 10.8%
n | 35616 | 10.7%
e | 35124 | 10.5%
s | 28296 | 8.5%
c | 23484 | 7.0%
i | 21452 | 6.4%
l | 19800 | 5.9%
r | 18260 | 5.5%
u | 17784 | 5.3%
m | 16196 | 4.9%
t | 12100 | 3.6%
f | 7320 | 2.2%
y | 7320 | 2.2%
d | 4812 | 1.4%
p | 3636 | 1.1%
h | 1588 | 0.5%
v | 16 | < 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 64308 | 100.0%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 8892 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
, | 32 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 390676 | 84.2%
Common | 73232 | 15.8%

Most frequent Latin characters

Value | Count | Frequency (%)
o | 44872 | 11.5%
a | 36012 | 9.2%
n | 35616 | 9.1%
e | 35124 | 9.0%
s | 28296 | 7.2%
c | 23484 | 6.0%
i | 21452 | 5.5%
l | 19800 | 5.1%
r | 18260 | 4.7%
u | 17784 | 4.6%
m | 16196 | 4.1%
t | 12100 | 3.1%
P | 10924 | 2.8%
I | 8892 | 2.3%
f | 7320 | 1.9%
y | 7320 | 1.9%
B | 7304 | 1.9%
E | 7304 | 1.9%
A | 7304 | 1.9%
C | 5224 | 1.3%
d | 4812 | 1.2%
T | 3652 | 0.9%
p | 3636 | 0.9%
M | 1588 | 0.4%
H | 1588 | 0.4%
Other values (6) | 4812 | 1.2%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 64308 | 87.8%
- | 8892 | 12.1%
, | 32 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 463908 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
(space) | 64308 | 13.9%
o | 44872 | 9.7%
a | 36012 | 7.8%
n | 35616 | 7.7%
e | 35124 | 7.6%
s | 28296 | 6.1%
c | 23484 | 5.1%
i | 21452 | 4.6%
l | 19800 | 4.3%
r | 18260 | 3.9%
u | 17784 | 3.8%
m | 16196 | 3.5%
t | 12100 | 2.6%
P | 10924 | 2.4%
I | 8892 | 1.9%
- | 8892 | 1.9%
f | 7320 | 1.6%
y | 7320 | 1.6%
B | 7304 | 1.6%
E | 7304 | 1.6%
A | 7304 | 1.6%
C | 5224 | 1.1%
d | 4812 | 1.0%
T | 3652 | 0.8%
p | 3636 | 0.8%
Other values (9) | 8020 | 1.7%

incsource
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
3 | 7304 | 82.1%
1 | 1588 | 17.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
3 | 7304 | 82.1%
1 | 1588 | 17.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8892 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
3 | 7304 | 82.1%
1 | 1588 | 17.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8892 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
3 | 7304 | 82.1%
1 | 1588 | 17.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8892 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
3 | 7304 | 82.1%
1 | 1588 | 17.9%

incsrcdesc
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.6 KiB

Value | Count | Frequency (%)
Bureau of Economic Analysis (BEA) | 7304 | 82.1%
Census | 1588 | 17.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 33
Median length: 33
Mean length: 28.17813765
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
(space) | 29216 | 11.7%
o | 21912 | 8.7%
s | 17784 | 7.1%
u | 16196 | 6.5%
n | 16196 | 6.5%
B | 14608 | 5.8%
a | 14608 | 5.8%
E | 14608 | 5.8%
c | 14608 | 5.8%
i | 14608 | 5.8%
A | 14608 | 5.8%
e | 8892 | 3.5%
r | 7304 | 2.9%
f | 7304 | 2.9%
m | 7304 | 2.9%
l | 7304 | 2.9%
y | 7304 | 2.9%
( | 7304 | 2.9%
) | 7304 | 2.9%
C | 1588 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 161324 | 64.4%
Uppercase Letter | 45412 | 18.1%
Space Separator | 29216 | 11.7%
Open Punctuation | 7304 | 2.9%
Close Punctuation | 7304 | 2.9%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
B | 14608 | 32.2%
E | 14608 | 32.2%
A | 14608 | 32.2%
C | 1588 | 3.5%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 21912 | 13.6%
s | 17784 | 11.0%
u | 16196 | 10.0%
n | 16196 | 10.0%
a | 14608 | 9.1%
c | 14608 | 9.1%
i | 14608 | 9.1%
e | 8892 | 5.5%
r | 7304 | 4.5%
f | 7304 | 4.5%
m | 7304 | 4.5%
l | 7304 | 4.5%
y | 7304 | 4.5%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 29216 | 100.0%

Most frequent Open Punctuation characters

Value | Count | Frequency (%)
( | 7304 | 100.0%

Most frequent Close Punctuation characters

Value | Count | Frequency (%)
) | 7304 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 206736 | 82.5%
Common | 43824 | 17.5%

Most frequent Latin characters

Value | Count | Frequency (%)
o | 21912 | 10.6%
s | 17784 | 8.6%
u | 16196 | 7.8%
n | 16196 | 7.8%
B | 14608 | 7.1%
a | 14608 | 7.1%
E | 14608 | 7.1%
c | 14608 | 7.1%
i | 14608 | 7.1%
A | 14608 | 7.1%
e | 8892 | 4.3%
r | 7304 | 3.5%
f | 7304 | 3.5%
m | 7304 | 3.5%
l | 7304 | 3.5%
y | 7304 | 3.5%
C | 1588 | 0.8%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 29216 | 66.7%
( | 7304 | 16.7%
) | 7304 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 250560 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
(space) | 29216 | 11.7%
o | 21912 | 8.7%
s | 17784 | 7.1%
u | 16196 | 6.5%
n | 16196 | 6.5%
B | 14608 | 5.8%
a | 14608 | 5.8%
E | 14608 | 5.8%
c | 14608 | 5.8%
i | 14608 | 5.8%
A | 14608 | 5.8%
e | 8892 | 3.5%
r | 7304 | 2.9%
f | 7304 | 2.9%
m | 7304 | 2.9%
l | 7304 | 2.9%
y | 7304 | 2.9%
( | 7304 | 2.9%
) | 7304 | 2.9%
C | 1588 | 0.6%

income
Real number (ℝ≥0)

Distinct: 8223
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.019014603e+10
Minimum: 0
Maximum: 1.641355086e+13
Zeros: 66
Zeros (%): 0.7%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4635.95
Q1: 21494.75
median: 45418.5
Q3: 139012250
95-th percentile: 8308526750
Maximum: 1.641355086e+13
Range: 1.641355086e+13
Interquartile range (IQR): 138990755.2

Descriptive statistics

Standard deviation: 6.143597517e+11
Coefficient of variation (CV): 15.28632793
Kurtosis: 408.4398343
Mean: 4.019014603e+10
Median Absolute Deviation (MAD): 38287.5
Skewness: 19.40635474
Sum: 3.573707785e+14
Variance: 3.774379045e+23
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
0 | 66 | 0.7%
4869 | 4 | < 0.1%
10142 | 3 | < 0.1%
21364 | 3 | < 0.1%
22491 | 3 | < 0.1%
29991 | 3 | < 0.1%
24042 | 3 | < 0.1%
6504 | 3 | < 0.1%
21197 | 3 | < 0.1%
12011 | 3 | < 0.1%
20403 | 3 | < 0.1%
14566 | 3 | < 0.1%
3649 | 3 | < 0.1%
6520 | 3 | < 0.1%
13097 | 3 | < 0.1%
6350 | 3 | < 0.1%
4116 | 3 | < 0.1%
4289 | 3 | < 0.1%
3942 | 3 | < 0.1%
35873 | 3 | < 0.1%
10130 | 3 | < 0.1%
10888 | 3 | < 0.1%
36931 | 3 | < 0.1%
35405 | 3 | < 0.1%
7316 | 3 | < 0.1%
Other values (8198) | 8753 | 98.4%

Smallest values

Value | Count | Frequency (%)
0 | 66 | 0.7%
353 | 1 | < 0.1%
356 | 1 | < 0.1%
371 | 1 | < 0.1%
375 | 1 | < 0.1%
402 | 1 | < 0.1%
427 | 1 | < 0.1%
445 | 1 | < 0.1%
472 | 1 | < 0.1%
477 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
1.641355086e+13 | 1 | < 0.1%
1.5912777e+13 | 1 | < 0.1%
1.5547661e+13 | 1 | < 0.1%
1.4811388e+13 | 1 | < 0.1%
1.406896e+13 | 1 | < 0.1%
1.3904485e+13 | 1 | < 0.1%
1.3233436e+13 | 1 | < 0.1%
1.2492705e+13 | 1 | < 0.1%
1.2459613e+13 | 1 | < 0.1%
1.2079444e+13 | 1 | < 0.1%

incrank
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 289
Distinct (%): 3.4%
Missing: 390
Missing (%): 4.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.83450953
Minimum: 0
Maximum: 353
Zeros: 211
Zeros (%): 2.4%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 16
median: 32
Q3: 50
95-th percentile: 107.95
Maximum: 353
Range: 353
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 47.06032359
Coefficient of variation (CV): 1.152464524
Kurtosis: 15.61896045
Mean: 40.83450953
Median Absolute Deviation (MAD): 17
Skewness: 3.648759633
Sum: 347175
Variance: 2214.674057
Monotonicity: Not monotonic
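
For a column with missing values, it is worth checking which denominator the summary figures use. The sketch below recomputes the incrank mean and distinct percentage over the non-missing rows only; the helper name `stats_over_valid` is illustrative. The results agree with the reported Mean and Distinct (%), which suggests that both are taken over the 8502 non-missing values rather than all 8892 observations.

```python
def stats_over_valid(n_observations, n_missing, total, n_distinct):
    """Return (valid row count, mean, distinct %) using non-missing rows only."""
    n_valid = n_observations - n_missing
    mean = total / n_valid
    distinct_pct = 100 * n_distinct / n_valid
    return n_valid, mean, distinct_pct

# Figures copied from the incrank section: 8892 observations, 390 missing,
# Sum 347175, Distinct 289.
n_valid, mean, distinct_pct = stats_over_valid(8892, 390, 347175, 289)

print(n_valid)
print(round(mean, 6), round(distinct_pct, 1))
```

Dividing the same sum by all 8892 rows would give a mean of about 39.0 instead, so the missing rows are not being treated as zeros.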
2020-12-12T15:11:39.700056image/svg+xmlMatplotlib v3.3.3, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
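Value | Count | Frequency (%)
0 | 211 | 2.4%
22 | 156 | 1.8%
23 | 142 | 1.6%
19 | 138 | 1.6%
20 | 137 | 1.5%
15 | 133 | 1.5%
21 | 133 | 1.5%
18 | 133 | 1.5%
28 | 130 | 1.5%
14 | 130 | 1.5%
13 | 130 | 1.5%
16 | 130 | 1.5%
31 | 129 | 1.5%
10 | 129 | 1.5%
24 | 127 | 1.4%
9 | 127 | 1.4%
1 | 126 | 1.4%
17 | 126 | 1.4%
26 | 126 | 1.4%
30 | 125 | 1.4%
11 | 123 | 1.4%
6 | 122 | 1.4%
29 | 122 | 1.4%
32 | 122 | 1.4%
12 | 122 | 1.4%
Other values (264) | 5173 | 58.2%
(Missing) | 390 | 4.4%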
 
Value | Count | Frequency (%)
0 | 211 | 2.4%
1 | 126 | 1.4%
2 | 116 | 1.3%
3 | 116 | 1.3%
4 | 116 | 1.3%
5 | 118 | 1.3%
6 | 122 | 1.4%
7 | 119 | 1.3%
8 | 120 | 1.3%
9 | 127 | 1.4%
 
Value | Count | Frequency (%)
353 | 1 | < 0.1%
352 | 4 | < 0.1%
346 | 1 | < 0.1%
340 | 1 | < 0.1%
339 | 1 | < 0.1%
337 | 1 | < 0.1%
335 | 3 | < 0.1%
334 | 1 | < 0.1%
333 | 2 | < 0.1%
332 | 2 | < 0.1%
 

population
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 3979
Distinct (%): 46.8%
Missing: 390
Missing (%): 4.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6081007.275
Minimum: 0
Maximum: 325719178
Zeros: 66
Zeros (%): 0.7%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1181.65
Q1: 5346.75
Median: 13896
Q3: 112686
95-th percentile: 1758397.75
Maximum: 325719178
Range: 325719178
Interquartile range (IQR): 107339.25

Descriptive statistics

Standard deviation: 37676031.66
Coefficient of variation (CV): 6.195689291
Kurtosis: 44.01002692
Mean: 6081007.275
Median Absolute Deviation (MAD): 10788
Skewness: 6.622763281
Sum: 5.170072385e+10
Variance: 1.419483361e+15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
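Value | Count | Frequency (%)
0 | 66 | 0.7%
126644 | 9 | 0.1%
1534 | 6 | 0.1%
3217 | 6 | 0.1%
794 | 6 | 0.1%
6070 | 6 | 0.1%
5557 | 6 | 0.1%
5739 | 6 | 0.1%
12131 | 6 | 0.1%
5613 | 5 | 0.1%
7485 | 5 | 0.1%
8292 | 5 | 0.1%
7567 | 5 | 0.1%
1541 | 5 | 0.1%
3601 | 5 | 0.1%
132202 | 5 | 0.1%
2334 | 5 | 0.1%
1385 | 5 | 0.1%
223507 | 5 | 0.1%
259520 | 5 | 0.1%
6881 | 5 | 0.1%
247405 | 5 | 0.1%
114827 | 5 | 0.1%
233746 | 5 | 0.1%
6118 | 5 | 0.1%
Other values (3954) | 8305 | 93.4%
(Missing) | 390 | 4.4%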
 
Value | Count | Frequency (%)
0 | 66 | 0.7%
202 | 2 | < 0.1%
204 | 2 | < 0.1%
209 | 2 | < 0.1%
277 | 2 | < 0.1%
290 | 2 | < 0.1%
316 | 2 | < 0.1%
345 | 2 | < 0.1%
403 | 2 | < 0.1%
414 | 2 | < 0.1%
 
Value | Count | Frequency (%)
325719178 | 2 | < 0.1%
323127513 | 2 | < 0.1%
320896618 | 4 | < 0.1%
318857056 | 2 | < 0.1%
318563456 | 2 | < 0.1%
316497531 | 2 | < 0.1%
316427395 | 2 | < 0.1%
314112078 | 2 | < 0.1%
314102623 | 2 | < 0.1%
311721632 | 2 | < 0.1%
 

releasedate
Real number (ℝ≥0)

MISSING

Distinct: 8
Distinct (%): 0.1%
Missing: 1682
Missing (%): 18.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 20155621.99
Minimum: 20140612
Maximum: 20190131
Zeros: 0
Zeros (%): 0.0%
Memory size: 69.6 KiB

Quantile statistics

Minimum: 20140612
5-th percentile: 20151119
Q1: 20151119
Median: 20151119
Q3: 20161117
95-th percentile: 20170628
Maximum: 20190131
Range: 49519
Interquartile range (IQR): 9998

Descriptive statistics

Standard deviation: 8042.790412
Coefficient of variation (CV): 0.0003990345926
Kurtosis: 5.563195653
Mean: 20155621.99
Median Absolute Deviation (MAD): 0
Skewness: 1.982201231
Sum: 1.453220346e+11
Variance: 64686477.61
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
20151119 | 4414 | 49.6%
20161117 | 2168 | 24.4%
20140612 | 160 | 1.8%
20180808 | 144 | 1.6%
20190131 | 128 | 1.4%
20170628 | 128 | 1.4%
20160928 | 64 | 0.7%
20170711 | 4 | < 0.1%
(Missing) | 1682 | 18.9%
 
Value | Count | Frequency (%)
20140612 | 160 | 1.8%
20151119 | 4414 | 49.6%
20160928 | 64 | 0.7%
20161117 | 2168 | 24.4%
20170628 | 128 | 1.4%
20170711 | 4 | < 0.1%
20180808 | 144 | 1.6%
20190131 | 128 | 1.4%
 
Value | Count | Frequency (%)
20190131 | 128 | 1.4%
20180808 | 144 | 1.6%
20170711 | 4 | < 0.1%
20170628 | 128 | 1.4%
20161117 | 2168 | 24.4%
20160928 | 64 | 0.7%
20151119 | 4414 | 49.6%
20140612 | 160 | 1.8%
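The releasedate values look like calendar dates stored as YYYYMMDD numbers, which would explain why numeric summaries such as the mean or the range are hard to interpret. The sketch below shows one possible way to decode them with the Python standard library; the YYYYMMDD reading is an assumption inferred from the values, and the helper name decode_release_date is hypothetical.

```python
import math
from datetime import datetime

def decode_release_date(value):
    """Turn a numeric YYYYMMDD value (e.g. 20151119.0) into a date.

    Missing values (None or NaN) are returned as None.
    Assumes the YYYYMMDD encoding, which the report does not state explicitly.
    """
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return datetime.strptime(str(int(value)), "%Y%m%d").date()

# The eight distinct values listed in the report, in ascending order.
observed = [20140612.0, 20151119.0, 20160928.0, 20161117.0,
            20170628.0, 20170711.0, 20180808.0, 20190131.0]
decoded = [decode_release_date(v) for v in observed]

# Differences between decoded dates are real durations, unlike the numeric range.
span_days = (decoded[-1] - decoded[0]).days
print(decoded[0], decoded[-1], span_days)
```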
 

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
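The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation of the textbook formula, not the code the report itself runs; the function name pearson_r and the sample data are made up for the example, and inputs with zero variance are not handled.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0]
r_steep = pearson_r(xs, [2 * x + 5 for x in xs])      # steep increasing line
r_shallow = pearson_r(xs, [0.1 * x - 3 for x in xs])  # shallow increasing line
r_negative = pearson_r(xs, [-x for x in xs])          # decreasing line
print(r_steep, r_shallow, r_negative)
```

Both increasing lines give the same coefficient regardless of slope and offset, which illustrates the invariance under changes of location and scale.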

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
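As a rough illustration of that recipe, the sketch below replaces each value by its rank (ties receive the average of their positions, a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and the sample data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

For this monotonic but nonlinear relationship, ρ reaches its maximum while r stays below it.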

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
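The pair-counting definition can be written down directly. The sketch below implements that plain definition (often called tau-a), in which tied pairs count as neither concordant nor discordant; libraries commonly use a tie-adjusted variant instead, so results may differ when ties are present. The function name and data are illustrative only.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(xs, ys)
print(tau)
```

With six pairs, five concordant and one discordant, the coefficient is (5 - 1) / 6.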

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
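For readers who want to see the shape of the computation, here is a minimal sketch of a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013): the chi-squared statistic is divided by n, a correction term is subtracted, and the table dimensions are shrunk slightly. It is an illustration under that assumption, not the report's own code; the function name is hypothetical, and the contingency table must have positive row and column totals.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))  # bias correction
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

v_independent = cramers_v_corrected([[10, 10], [10, 10]])  # no association
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfect association
print(v_independent, v_perfect)
```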

Missing values


Sample

First rows

(index) | stateabbrv | statename | stfips | areatyname | areaname | areatype | area | periodyear | periodtype | pertypdesc | period | inctype | incdesc | incsource | incsrcdesc | income | incrank | population | releasedate
0 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1929 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 85126000000 | 0.0 | 121769000.0 | 20140612.0
1 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1929 | 1 | Annual | 0 | 2 | Per Capita Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 699 | 0.0 | 121769000.0 | 20140612.0
2 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1930 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 76371000000 | 0.0 | 123075000.0 | 20140612.0
3 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1930 | 1 | Annual | 0 | 2 | Per Capita Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 621 | 0.0 | 123075000.0 | 20140612.0
4 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1931 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 65507000000 | 0.0 | 124038000.0 | 20140612.0
5 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1931 | 1 | Annual | 0 | 2 | Per Capita Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 528 | 0.0 | 124038000.0 | 20140612.0
6 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1932 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 50134000000 | 0.0 | 124839000.0 | 20140612.0
7 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1932 | 1 | Annual | 0 | 2 | Per Capita Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 402 | 0.0 | 124839000.0 | 20140612.0
8 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1933 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 47085000000 | 0.0 | 125580000.0 | 20140612.0
9 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1933 | 1 | Annual | 0 | 2 | Per Capita Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 375 | 0.0 | 125580000.0 | 20140612.0

Last rows

(index) | stateabbrv | statename | stfips | areatyname | areaname | areatype | area | periodyear | periodtype | pertypdesc | period | inctype | incdesc | incsource | incsrcdesc | income | incrank | population | releasedate
8882 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 2012 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 13904485000000 | 0.0 | 314102623.0 | 20160928.0
8883 | CO | Colorado | 8 | State | Colorado | 1 | 0 | 2009 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 198082468000 | 23.0 | 4972195.0 | 20160928.0
8884 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1996 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 6661697000000 | 0.0 | 269394284.0 | 20151119.0
8885 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1994 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 5930316000000 | 0.0 | 263125821.0 | 20151119.0
8886 | CO | Colorado | 8 | Metropolitan Statistical Area | Denver - Aurora MSA | 21 | 19740 | 2014 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 148684245000 | 17.0 | 2754258.0 | 20151119.0
8887 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1978 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 1851867000000 | 0.0 | 222098244.0 | 20151119.0
8888 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1973 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 1131213000000 | 0.0 | 211349205.0 | 20151119.0
8889 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1986 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 3712243000000 | 0.0 | 240132831.0 | 20151119.0
8890 | CO | Colorado | 8 | State | Colorado | 1 | 0 | 2000 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 147226744000 | 22.0 | 4326921.0 | 20160928.0
8891 | US | U.S.A. | 0 | United States | United States | 0 | 0 | 1988 | 1 | Annual | 0 | 1 | Total Personal Income - Bureau of Economic Analysis | 3 | Bureau of Economic Analysis (BEA) | 4260753000000 | 0.0 | 244499004.0 | 20151119.0